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Abstract: Natural soil-forming factors such as landforms, parent materials or biota lead to high variability in 
soil properties. However, there is not enough research quantifying which environmental factor(s) can be the 
most relevant to predicting soil properties at the catchment scale in semi-arid areas. Thus, this research aims 
to investigate the ability of multivariate statistical analyses to distinguish which soil properties follow a clear 
spatial pattern conditioned by specific environmental characteristics in a semi-arid region of Iran. To achieve 
this goal, we digitized parent materials and landforms by recent orthophotography. Also, we extracted ten 
topographical attributes and five remote sensing variables from a digital elevation model (DEM) and the 
Landsat Enhanced Thematic Mapper (ETM), respectively. These factors were contrasted for 334 soil samples 
(depth of 0—30 cm). Cluster analysis and soil maps reveal that Cluster 1 comprises of limestones, massive 
limestones and mixed deposits of conglomerates with low soil organic carbon (SOC) and clay contents, and 
Cluster 2 is composed of soils that originated from quaternary and early quaternary parent materials such as 
terraces, alluvial fans, lake deposits, and marls or conglomerates that register the highest SOC content and the 
lowest sand and silt contents. Further, it is confirmed that soils with the highest SOC and clay contents are 
located in wetlands, lagoons, alluvial fans and piedmonts, while soils with the lowest SOC and clay contents 
are located in dissected alluvial fans, eroded hills, rock outcrops and steep hills. 'The results of principal 
component analysis using the remote sensing data and topographical attributes identify five main components, 
which explain 73.3% of the total variability of soil properties. Environmental factors such as hillslope 
morphology and all of the remote sensing variables can largely explain SOC variability, but no significant 
correlation is found for soil texture and calcium carbonate equivalent contents. Therefore, we conclude that 
SOC can be considered as the best-predicted soil property in semi-arid regions. 
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1 Introduction 


In arid and semi-arid areas, vegetation patterns and pedogenesis are highly affected by extreme 
climate conditions, and also by the topography (Jia et al., 2017; Cerdà et al., 2018). Flat areas, such 
as alluvial fans, valleys or lagoons, that are highly exposed to high peaks of insolation, register high 
average temperatures with prolonged dry seasons that enhance high evapotranspiration rates (Li et 
al., 2016), the difficulties of organic matter consolidation (Wang et al., 2016), the generation of soil 
crusts (Singer and Shainberg, 2004) and the reduction of available water for the plants (Amare et al., 
2014). 

Specifically, in semi-arid landscapes, soil texture, organic carbon content and calcium carbonate 
content are able to reflect the spatial distributions of soil nutrients and vegetative cover (Shiri et al., 
2017; Keshavarzi et al., 2018). Soil particle distribution is considered as a key parameter of soil 
quality because soil clay content controls soil cation-exchange capacity (Khaledian et al., 2017a; 
Sulieman et al., 2018), soil aggregate stability (Schjenning et al., 2007), and soil organic carbon 
preservation (Zeraatpishe and Khormali, 2012). Moreover, soil silt is also highly related to soil 
erosion and is sensitive to soil particle detachment and transport processes (Ramos et al., 2000). Sand 
and clay particles determine water content in the field and wilting point capacities, depending on the 
carbonate and organic carbon contents (Wang et al., 2013). Moreover, soil organic carbon stores and 
mineral horizons also show a high spatial variation due to the characteristics of topography, slope 
gradient and curvature (Conforti et al., 2016), and the parent material and type of land use (Sun et 
al., 2015). Also, based significantly on parent material, landform distributions and hydrological 
conditions, several scholars have demonstrated that soil calcium carbonate is accumulated, 
transported and distributed (Khormali et al., 2009; Karchegani et al., 2011; Wilford et al., 2015). In 
regard to topographic position, Stavi et al. (2008) concluded that calcium carbonate content in soil 
profiles can be lower on the north-facing hillslope than on the south-facing hillslope in the Negev 
region of Israel because of the lower evaporation on the north-facing hillslope. However, there is a 
lack of information about whether the combination of soil texture, organic carbon and carbonate 
calcium follows common spatial patterns in the same landscape and is conditioned by topographical 
variables, parent materials and/or vegetation covers. 

Cluster analysis (CA) and principal component analysis (PCA) are considered as useful statistical 
techniques to assess the spatial relationships among soil variables and to obtain a group of factors to 
explain the variability of the soil system (Shukla et al., 2006). Both CA and PCA are frequently and 
respectively used as they are capable of detecting similarities among soil samples and/or variables 
(Tran et al., 2010; Ruggieri et al., 2011). However, they are rarely used in combination previously. 

Moreover, in order to check the obtained results from the multivariate statistical analyses with the 
spatial distribution of the soil properties, several researchers also highlighted the use of geostatistical 
analysis as the most useful tool (Gribov and Krivoruchko, 2012; Martinez-Murillo et al., 2017). 
Using remote sensing data and Geographic Information Systems (GISs), techniques such as inverse 
distance weighting (IDW), ordinary kriging, empirical Bayesian kriging and radial basis functions 
are widely used in soil sciences and other disciplines such as hydrology, geology and environmental 
sciences (Sagir and Kurtulus, 2017; Samsonova et al., 2017). 

The objective of this research is to investigate the ability of multivariate statistical analyses (CA 
and PCA) to distinguish which soil properties follow a clear spatial pattern conditioned by specific 
environmental characteristics, using the IDW interpolation method, in a semi-arid case study 
catchment located in central Iran. 


2 Materials and methods 


2.1 Study area 


The study area is located in the Borujen region, Chaharmahal-Va-Bakhtiari Province, central Iran. 
This area extends from 31?41'57" to 32?00'01"N latitude and from 51?02'45" to 51?19'09"E 
longitude (Fig. 1), and covers an area of approximately 860 km? with a mean elevation of about 2200 
m a.s.l. (Zeraatpisheh et al., 2017). 
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The climate conditions can be defined as semi-arid, with mean annual precipitation and annual 
mean temperature of 255 mm and 10.7?C, respectively. The main parent materials are limestones 
(Zeraatpisheh et al., 2017; 2019; Ayoubi et al., 2018). Soil moisture and temperature regimes are 
xeric and mesic, respectively, according to the soil taxonomy (Soil Survey Staff, 2014). The main 
land uses include irrigated wheat cropping, dryland farming, and pastures. Predominant crops are 
wheat, barley and alfalfa. In pasture land uses, the main plant species are milkvetch and Bromus 
tectorum. The major landscape units in the study area comprise mountains, hill-lands, piedmonts, 
and lowlands. 
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Fig. 1 Location of the study area in Chaharmahal-Va-Bakhtiari Province (a) and distributions of soil sampling 
points (black circles) in the study area based on Landsat ETM (colour composition; b) 


2.2 Soil sampling method and laboratory analysis 


Distributions of 334 soil sampling points acquired by conditioned Latin Hypercube Sampling (cLHS) 
(Minasny and McBratney, 2006) are shown in Figure 1. Soil sampling to depths of 0—30 cm was 
performed at each sampling point and Matlab software (Mathworks) was used to estimate the 
quantitative attributes from the topographical data and remote sensing data (Table 1). Three soil 
properties were analyzed: soil organic carbon (SOC), calcium carbonate equivalent (CCE) and soil 
texture (clay, silt, and sand). SOC was determined by a modified Walkley and Black method 
according to Nelson and Sommers (1996). CCE was measured using the method of Page et al. (1982) 
and soil particle size distribution was determined by the Bouyoucos hydrometer method (Bouyoucos, 
1962). 


2.3 Soil mapping, remote sensing data and topographical attributes 


SOC, CCE and soil texture (clay, silt, and sand) contents were represented in the form of maps using 
the inverse distance weighting (IDW) interpolation method with ArcMap 10.5 (ESRI, USA). By 
cross-validation, performance statistics of R? (coefficient of determination) and RMSE (root mean 
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square error) were calculated for IDW interpolation to validate which soil property is the best 
predicted. 

Geological units were extracted using the 1:100,000 Borujen Geology Map (1990) and land unit 
areas by air photointerpretation. Topographical attributes were obtained from a Digital Elevation 
Model (DEM) with a cell size of 30 m, which was provided by the ASTER GDEM database (ASTER 
GDEM, 2009). Remote sensing indices were derived from the Landsat Enhanced Thematic Mapper 
(ETM) acquired in 2008 (U.S. Geological Survey, 2004). Ten topographical attributes including 
elevation, topographic wetness index (TWI), SAGA (System for Automated Geoscientific Analysis) 
wetness index (WI), multi-resolution of ridge top flatness index (MrRTF), multi-resolution valley 
bottom flatness index (MrVBF), curvature (Cur), profile curvature (PrCu), plan curvature (PlCu), 
aspect (Asp) and slope (Slp) at the same resolution were calculated. Furthermore, five remote sensing 
auxiliary variables, namely the normalized difference vegetation index (NDVI), ratio vegetation 
index (RVI), perpendicular vegetation index (PVI), clay index (CI), and soil-adjusted vegetation 
index (SAVI) were also extracted. The SAGA GIS was employed to derive the environmental factors 
(Olaya, 2004) (Table 1). 


Table 1 Environmental covariates used in this study 


Environmental Nature of the soil variable 


factor derived. Name of factor Definition Type 
El Elevation (m) 
TWI Topographic wetness index 
WI Wetness index 
MrRTF Multi-resolution of ridge top flatness index 
MrVBF Multi-resolution valley bottom flatness index " 
Topography DEM Quantitative 
Cur Curvature 
PrCu Profile curvature 
PICu Plan curvature 
Asp Aspect (radians) 
Slp Slope (°) 
NDVI Normalized difference vegetation index 
RVI Ratio vegetation index 
Biogeography Landsat ETM PVI Perpendicular vegetation index Quantitative 
CI Clay index 
SAVI Soil adjusted vegetation index 
Landform Landscape and landform GEM Landforms (10 units) 
Geology map Geologic unit GEO A 2 s SEEN 


Note: DEM, Digital Elevation Model; ETM, Enhanced Thematic Mapper. 


2.4 Statistical analysis 


Descriptive statistics of soil properties, topographical attributes, and remote sensing data, including 
minimum, maximum, mean, standard deviation, variance, the coefficient of variation (CV), median, 
skewness and kurtosis, were determined. 

The CA was undertaken for the different geological and landform units, and average linkages 
between groups for dissimilarities in the variables were calculated. Then, hierarchical clustering 
series using Euclidean distance and squared Euclidean distance methods for geology and landform 
units were extracted from the dataset, respectively. These techniques have been applied to assess 
diverse natural source variations in soil properties and groups of soil types among similar clusters 
(Massart and Kaufman, 1983). 

Furthermore, the PCA was carried out to identify distribution and relationships of soil texture, 
SOC, and CCE with environmental factors. The PCA allowed grouping of similar variables into 
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dimensions, without distinguishing independent and dependent variables. Firstly, the Pearson's 
correlation was conducted to check the linear correlations among variables. Secondly, the variables 
were centred and normalized. Then, a Kaiser-Mayer-Olkin (KMO) test of sphericity was conducted 
and the coefficients and determinants were observed to verify the assumptions. The orthogonal 
rotation method (Varimax) and the correlation matrix using factors with eigenvalues >1 were 
performed. The variables used to achieve our main goal should model the phenomenon observed 
(soil properties, topographical attributes and remote sensing data) in the closest possible way, 
identifying specifically the most significant variables that condition soil distribution and formation 
(Malinowski, 2002; Shukla et al., 2006). Factor loading indicates the correlation between variable 
and underlying common factor. The highly loading variables can be used to propose a possible 
common underlying factor that links variables together within each factor. The signs of the factor 
loadings provide information on how these variables are related and when they are representing the 
common factors. 

Our dataset included twenty environmental factors, which were grouped as follows: (i) 
topographical attributes, namely El (elevation), Asp, TWI, WI, MrVBF, MrRTF, Slp, Cur, PlCu, and 
PrCu; (ii) remote sensing variables, namely NDVI, RVI, PVI, CI, and SAVI; and (iii) soil properties 
of soil particle distribution (clay, silt and sand), SOC content and CCE content. All statistical analyses 
were performed using SPSS 23.0 (IBM Corp, 2015, USA). 


3 Results 


3.1 Soil properties, topographical attributes and remote sensing data 


In Figures 2-4, soil properties, geology, and landforms in the study area are represented in the form 
of maps. Moreover, descriptive statistics of soil properties, topographical attributes, and remote 
sensing data are summarized in Table 2. Also, total areas of each geological and landform units are 
summarized in Tables 3 and 4, respectively. 
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Fig.2 Distributions of soil properties in the study area using inverse distance weighting (IDW) interpolation method. 
(a), soil organic carbon (SOC); (b), calcium carbonate equivalent (CCE); (c), sand content; (d), silt content; (e), clay 
content. 


201909.00010v1 


chinaXiv 


ChinaXiv& ERAT! 


JOURNAL OF ARID LAND 2019 Vol. 11 No. 4 


51?05'00"E 51°07'30"E 51?10'00"E 51?12'30"E 51?15'00"E 51°17'30"E 


E 
[0] 
G 
B 
a 


Jkimk 
Kld 
Kim! 
Kshgu 
Kt2 
Pleb 
//| Qat 
Qmc 


Qu 
Qt2 


E BÁBESBSE 


| ES 


Qi3 
SC 


77 
pod 
E 
m 


31?45'00"N  31?47'30"N 31°50'00"N 31°52'30"N 31°55'00"N 31°57'30"N 32°00'00"N 


7775 
VS 777 LLL LL ELL Z LZ ES 


51?05'00"E 5190730"E 51*10'00"E 51?12'30"E 51?15'00"E 5191730"E 


31?42:30"N  31?45'00"N 31?47'30"N  31?50'00"N 31?52':30"N 31?55'00"N 31°57'30"N  32?00'00"N 


31°42'30"N 


Fig. 3 Geological units in the study area. JkImk, mixed deposit of limestone, marl and shale; Kld, massive dark 
grey limestone, marl and shale; KIm1, marl and limestone; Kshgu, marl and rare sandstone; Kt2, dark grey massive 
limestone; Plcb, thick-bedded conglomerate with marl; Qat, river channel and recent alluvium; Qmc, grey marl, 
conglomerate and lake deposit; Qt1, older terrace and alluvial fan; Qt2, young terrace and alluvial fan; Qt3, silty clay 
flat; SC, silty clay flat (Zeraatpisheh et al., 2017). The abbreviations are the same as in Figure 5. 


Dark grey massive limestone and young terrace and alluvial fan units occupy 26.4% and 18.8% of 
the total area, respectively (Table 3). The recent and youngest parent materials are distributed in the 
low altitudes, mostly from the centre to the northeast of the study area. Moreover, the old parent 
materials are distributed in the high altitudes. As shown in Figure 4 and Table 4, a total of ten 
landform units are obtained, and the areas of Mol (rock outcrop) and Pil (alluvial fan) are the largest, 
with the total area percentages of 33.9% and 30.4%, respectively. Mol, Hil (eroded hill land), and 
Hi2 (developed hill land) landforms cover the southwest, the east and some part of the centre in the 
study area. Pil and Pi5 (piedmont plain) landforms mostly are located in the centre with the youngest 
parent materials. 

The highest SOC contents are observed from the centre to the southwest and the west of the study 
area (Fig. 2a), coinciding with P11 (wetland), P12 (lagoon), Pil and Pi4 (river plain) landform units 
(Fig. 4). However, the lowest SOC values are found in the northern, north-western and north-eastern 
parts, particularly along the Mol, Hil and Hi2 landform units (Fig. 4). 

From Figure 2b we can find that the lowest CCE values are located in the northern parts and there 
are only three small patches with values higher than 60.0%. Soils with the highest clay contents are 
in the lowest altitudes, which correspond to the P11 and P12 landform units (Figs. 2e and 4). 

Regarding descriptive statistics (Table 2), SOC content ranges from 0.3% to 2.9% (CV=58.1%). 
Related to soil texture, clay content ranges from 18.0% to 6.01% (CV=20.1%), silt from 20.0% to 
60.0% (CV=15.8%) and sand from 8.0% to 54.0% (CV=35.5%). Furthermore, CCE content shows 
values from 11.0% to 73.0% (CV=31.4%). The skewness values of CCE, clay and silt contents are 
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less than 0.54, and the kurtosis values are less than 0.88. It should be noted that SOC content shows 
the highest values of skewness and kurtosis among all soil properties. 
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Fig. 4 Landform units in the study area. P12, lake (lagoon); P11, wetland; Pi5, piedmont plain; Pi4, river plain; Pi3, 
pediment; Pi2, dissected alluvial fan; Pil, alluvial fan; Mo1, rock outcrop; Hi2, developed hill land; Hil, eroded hill 
land. The abbreviations are the same as in Figure 4. 


Table 5 provides the results of R? and RMSE indices to demonstrate which soil property registers 
the best validation using IDW. The results show that SOC registers the highest R? value (0.62) and 
the lowest RMSE value (0.30), while silt records the lowest R? value (0.10). 


3.2 Influences of parent materials and landform units on soil properties 


Figure 5 shows the dendrogram of geology units, where three highly differentiated clusters showing 
high similarities can be noted. From the top to the bottom of the dendrogram, the first group (Cluster 
1) contains the oldest parent materials (Cretaceous) composed of Kt2 (dark grey massive limestone), 
Plcb (thick-bedded conglomerate with marl), JkImk (mixed deposit of limestone, marl and shale), Kld 
(massive dark grey limestone, marl and shale), and KIm1 (marl and limestone). Cluster 2 is grouped 
with the quaternary parent materials of Qmc (grey marl, conglomerate and lake deposit), Qt2 (young 
terrace and alluvial fan), Qat (river channel and recent alluvium), Qt1 (older terrace and alluvial fan), 
Qt3 (silty clay flat), and SC (silty clay flat). Moreover, it consists of two sub-clusters that are separated 
by other two different geological formations of Qt3 and SC. Cluster 3 contains one geological unit 
characterized by Kshgu. We observe that Cluster 1 contains the majority of lithologies with a lower 
amount of SOC and clay contents compared with the surrounding areas. Cluster 2 contains the 
lithologies that comprise soils with the highest SOC content and the lowest sand and silt contents. 

A hierarchical CA is employed to detect dissimilarities among landform units. A dendrogram of 
landform units is generated to group the sampling points into two statistically significant clusters 
(Fig. 6). The first cluster contains soils that are situated on piedmonts, eroded hillslopes and lowland 
landscapes that comprise moderately to highly developed soils. The second cluster shows soils 
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Table2 Descriptive statistics of soil properties, topographical attributes and remote sensing variables 


Factor Min. Max. Mean SD Variance CV Median Ske. Kurtosis 
SOC (%) 0.30 2.90 0.90 0.50 0.30 58.10 0.70 2.13 4.94 
CCE (%) 11.00 73.00 32.50 10.20 104.00 31.40 32.00 0.54 0.63 
Clay (96) 18.00 61.00 37.90 7.60 58.00 20.10 38.00 0.11 0.47 
Silt (96) 20.00 60.00 40.20 6.30 40.20 15.80 40.00 —0.01 0.88 
Sand (96) 8.00 54.00 21.90 7.80 60.30 35.50 21.00 1.11 1.75 
El (m) 2147.90 2710.90 2268.00 74.90 5616.00 3.30 2250.00 1.89 5.63 
TWI 13.71 19.13 14.79 1.18 1.40 8.00 14.30 1.36 1.36 
WI 9.38 15.07 12.07 1.28 1.64 10.60 12.06 —0.05 -1.11 
MrVBF 0.00 5.95 1.81 1.54 2.38 85.46 1.39 0.77 —0.46 
MrRTF 0.00 3.98 0.76 0.90 0.82 119.28 0.34 1.45 1.22 
Cur —0.010 0.010 0.000 0.001 0.000 —488.350 0.000 —1.200 8.090 
m 0.54 5.88 3.19 1.38 1.90 43.13 3.29 -0.05 -1.00 
PlCu —0.002 0.002 0.000 0.001 0.000 869.400 0.000 —0.170 1.610 
PrCu —0.008 0.005 0.000 0.001 0.000 —321.610 0.000 —1.400 12.190 
Slp (°) 0.00 0.41 0.07 0.06 0.00 89.91 0.05 2.26 8.42 
NDVI -0.11 0.47 0.04 0.13 0.02 351.54 —0.01 1.22 0.80 
RVI —0.004 0.001 0.000 0.001 0.000 —401.220 0.000 —1.070 0.320 
PVI 6.61 135.60 73.49 30.07 904.45 40.92 78.95 —0.51 —0.64 
CI 1.11 1.85 1.24 0.13 0.02 10.12 1.19 1.85 3.90 
SAVI —0.17 0.70 0.05 0.19 0.04 351.54 —0.01 1.22 0.80 


Note: Min., minimum; Max., maximum; SD, standard deviation; CV, coefficient of variation; Ske., Skewness; SOC, soil organic carbon; 
CEE, calcium carbonate equivalent. The unit measurements of minimum, maximum, mean, SD, variance, CV and median are the same with 
the corresponding environmental factor, while skewness and kurtosis are dimensionless parameters. 


Table3 Characteristics of geological units in the study area 


Geology Age Geology formation Area(km?) Percentage of total area (96) 
SC Young quaternary Silty clay flat 9.6 1.1 
Qat Young quaternary River channel and recent alluvium 15.9 1.8 
Qt3 Quaternary Silty clay flat 151.0 17.5 
Qt2 Quaternary Young terrace and alluvial fan 162.1 18.8 
Qtl Quaternary Older terrace and alluvial fan 94.7 11.0 

Qmc Early quaternary Grey marl, conglomerate and lake deposit 12.3 1.4 
Plcb Pliocene Thick-bedded conglomerate with marl 57.4 6.7 
Kt2 Late Cretaceous Dark grey massive limestone 228.1 26.4 
Klml Late Cretaceous Marl and limestone 38.4 44 
Kshgu Late Cretaceous Marl and rare sandstone 40.1 4.6 
Jklmk Early Cretaceous Mixed deposit of limestone, marl and shale 24.3 2.8 
Kld Early Cretaceous Massive dark grey limestone, marl and shale 30.1 3.5 


Table 4 Characteristics of landform units in the study area 


Landform Area Percentage of total m 
Landscape Landform code (km?) area (%) Definition 
Hill Eroded hill land Hil 39.2 4.5 Continuous hills with a high altitude 
Developed hill , Active fan, cultivated plain in the 
land m d ds low altitudes 
Mountain Rock outcrop Mol 292.9 33.9 Eroded rock surface, shallow soils 
Piedmont Alluvial fan Pil 262.3 304 Dissected MM Modulating. Ted 
alluvial fan 
B Md Pi2 52.6 6.1 Shallow soil, colluvial material 
Pediment Pi3 79.6 9.2 Low drainage, young terraces 
River plain Pi4 57 0.7 terraces, moderate depth 
Piedmont plain Pi5 57.4 6.7 Low drainage, wetness 
Lowland Wetland Pll 3.0 0.4 Covered by hydrophilic plants 


Lake (Lagoon) P12 8.2 1.0 Single and low topography hills 
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Table5 Validation criteria for inverse distance weighting (IDW) interpolation method among soil properties 


Soil property R RMSE 
SOC 0.62 0.30 
CCE 0.35 8.25 
Clay 0.35 6.17 

Silt 0.10 6.27 
Sand 0.16 7.21 


Note: R?, coefficient of determination; RMSE, root mean square error. 


originated from Mol and Hi2 landform units that are mostly characterized by rock outcrops and 
eroded rock surfaces with shallow to very shallow soil depths. The CA allows us to distinguish 
landforms that register soils with a higher amount of SOC and clay contents (Cluster 1); meanwhile, 
Cluster 2 contains soils with lower SOC content values and clay contents. 
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Fig. 5 Cluster analysis of geology units in the study area 
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Fig. 6 Cluster analysis of landform units in the study area 


3.3 Correlations among soil properties, topographical attributes and remote sensing 
variables 

Pearson's correlation analysis between soil properties and environmental factors are presented in 
Table 6. Absolute correlation values higher than 0.40 are considered significant. In this context, a 
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significant correlation exits between SOC and MrVBF and between SOC and all the remote sensing 
variables. As a result, SOC can be considered as a potential soil indicator to confirm the vegetation 
development and abundance obtained from the remote sensing analysis. Clay and silt contents show 
significant negative correlations with sand content, with correlation coefficients of —0.63 and —0.43, 
respectively. Moreover, CCE does not show significant correlations with any soil property or 
environmental factor (topographical and remote sensing variables). The El shows slightly positive 
correlations with Slp (7=0.45, P«0.05) and TWI (r=0.41, P«0.05). TWI also shows negative 
correlations with MrRTF and MrVBF. Additionally, WI shows a positive correlation with MrVBF 
(7=0.52, P<0.05) and a negative one with Slp (r— —0.50, P<0.05). As expected, curvature also exhibits 
positive correlations with PlCu (7=0.68, P«0.05) and PrCu (7=0.94, P<0.05). Finally, it is interesting 
to observe that soil properties show the highest correlations with all the remote sensing variables, but 
surprisingly, NDVI does not show any significant correlation with some of them (Table 6). 


3.4 Principal component analysis (PCA) 


The results of PCA show five main components to explain the variability of the selected soil 
properties (Table 7). It shows that the first two components (PCA 1 and PCA 2) can explain more 
than 42.0% ofthe total variability. Then, when including the other three components, a total of 73.3% 
is explained. In Table 8, the obtained components are added and the factors are identified. The 
diagram of factors obtained to explain the geomorphological influence on soil properties is shown in 
Figure 7. The commonalities of twenty factors considered in this study indicate that these five 
components explain a large part of the variation in most of the measured factors (Table 8). 
Component 1 is able to explain about 27.3% of the total variance and it is highly contributed to by 
SOC and all ofthe remote sensing variables (Tables 7 and 8). Once again, it is remarkable to observe 
the strong relationships between SOC and remote sensing variables. Component 2 can explain about 
15.1% of the total variance, with significant contributions from the landform covariates related to the 
hillslope morphology and wetness indices such as El, TWI, WI, MrVBF, and Slp. Component 3 
explains about 14.596 of the total variance, being highly contributed to by the curvature and wetness 
indices such as TWI, WI, MrRTF, Cur, PlCu, and PrCu. Component 4 explains about 8.496 of the 
total variance and is dominantly contributed to by clay and sand contents. Lastly, component 5 
accounts for about 7.9% of the total variance, with significant contributions from CCE, and silt and 
sand contents. In general, we conclude that the first component 1s highly contributed to by remote 
sensing variables and SOC, and the second and third components by wetness indices and the 
morphology of the hillslope. Conversely, the fourth and fifth components primarily include soil 
properties and do not relate strongly with topographical attributes and remote sensing data (Fig. 7). 


4 Discussion 


Among all soil properties, SOC shows the highest CV value of 58.1%, which indicates a broad range 
of SOC values across the study area. Wilding (1985) categorized CV values into three classes of high 
variability (CV>35%), moderate variability (15%<CV<35%) and low variability (CV<15%). Also, 
the high skewness and kurtosis values of SOC confirm the vast distribution of SOC in the study area 
(Rosemary et al., 2017). Therefore, this implies the importance of knowing which kind of spatial 
pattern can SOC follow under different environmental conditions, and this needs to be further 
clarified and determined. The high variation of SOC could also be due to the land use practices. In 
semi-arid areas, agricultural activities can lead to a significant increase in SOC in comparison to bare 
lands. In China, for example, Liao et al. (2015) concluded that agricultural intensification increases 
crop productivity and crop residues, and also highly affects SOC values and carbon sequestration. 

The CVs for the other soil properties are more moderate, coinciding with the results in other semi- 
arid areas with non-active land use changes. Also, the predominance of parent materials can be 
another driving factor as other researchers confirmed (Anderson, 1988; Bruand and Tessier, 2000; 
Orgill et al., 2017). Therefore, the moderate variability of CCE content might be explained because 
all soils are formed on limestone and calcareous materials. It is important to highlight that high levels 
of CCE in the southwest of the study area could be affected by soil crusts, abandonment of 


ChinaXiv& (ERAT! 


Mojtaba ZERAATPISHEH et al.: Determining the spatial distribution of soil properties using the... 


001 
+960 
#56 0- 
«66 0- 
«001 
ITO- 

900 

100 
90 0- 

$00 

+10 
sco 

ETO 
TTo- 
c£ 0- 
61 0- 

+10 

800 

FTU 
«£90 
IAVS 


001 
+880- 
*$60- 
+960 
oTo- 

+00 

100 
t00- 

£00 
ETO 
6E 0 
+10 
tTO- 
6c0- 
$T0- 
cro 

600 

670 
+190 

I2 


001 
#960 
#56 0- 

STO 

c00- 

900 

900 

000 

L00- 

tEO- 

ETO- 

cro 

970 

tO 

STO- 

oro- 

LTO- 
+190- 

IAd 


001 
«66 0- 
Ico 
$0'0- 
100- 

100 
$0'0- 
tTO- 
sco- 
ETO- 
TTo 
ceo 
oco 
cro- 
I1 0- 
STO- 
«$9 '0- 
IAN 


001 
ITO- 
90'0 
T0'0 
90°0- 
s0'0 
+10 
80 
£10 
TTo- 
TEO- 
610- 
LAU 
800 
sco 
«£90 
IAGN 


001 
sTo- 
z700 
90'0 
61 0- 
TEO- 
*9'0- 
«0$ 0- 
«650 
«Sto 
$0'0 
+00 
I1 0- 
Hro- 
6c 0- 
dis 


Nid 


c0 0- 
T0 0 
9070- 
600 
900 
c0 0- 
nOld 


001 
$0'0- 
soo- 
£00- 
t00- 
oro 
£00 
€0°0- 
€£0'0 
0070 
$0'0- 
oro- 
dsy 


001 
ECO 
£00- 
SEO- 
teo- 
Iro- 
90°0- 
90°0- 
cro 
900 
£0'0 
m 


00'I 
610 
£0'0- 
«IF 0- 
LTO- 
100- 
ETO- 
sto 
sto 
400 


001 
«TSO 
*«£S0- 

6£0- 

tTO- 
£00- 

ETO 

Tg0 
«ISO 


001 

$00- 001 
ETO- ItO 
t00 Iro 
$00 900 
800- LTO- 
800  9c70- 
TTO LTO- 


JN dHANN IM IML 
səjqeuea 3ursues yowa pue sanqıye peorgde130do; *senredoud [ios 8uoure sisfeue uone[anroo suosieod 9 AQEL 


L^0L000'606L0c: A!Xeuiu» 


001 

oro ooi 

TOO Eto 

TTO- +990- 

ETO- Wo 

IEO A0- 
[Et pues 


001 
6€0- 
sTo- 
£10 
WS 


001 
100- 001 

ZOO +70 ool 
Áe|O 320 20S 


“PASI $0 Q»d I UBD gris uone[o102 *, 30N 


IAVS 


201909.00010v1 


chinaXiv 


ChinaXiv& f'ERBTIJ 


JOURNAL OF ARID LAND 


Table7 Explained variance of the first five components of principal component analysis (PCA) 


Component Eigenvalue Percentage of variance (96) Accumulated percentage of variance (96) 
1 5.466 27.3 27.3 
2 3.019 15.1 42.4 
3 2.908 14.5 57.0 
4 1.686 8.4 65.4 
5 1.575 7.9 73.3 


Table8  Eigenvectors of the selected variables for the first five components of PCA 


Component 
Factor 1 2 3 4 5 
SOC 0.677* 0.342 —0.021 0.077 —0.031 
CCE 0.257 0.231 0.051 0.217 0.597* 
Clay 0.041 0.095 0.077 —0.943* 0.263 
Silt 0.177 0.001 —0.026 0.143 —0.902* 
Sand —0.185 —0.094 —0.055 0.807* 0.478* 
El —0.252 —0.549* -0.117 0.076 —0.103 
TWI —0.109 —0.683* —0.401* 0.064 —0.119 
WI 0.060 0.605* —0.526* 0.137 —0.081 
MrVBF 0.296 0.807* —0.142 —0.025 0.082 
MrRTF 0.052 0.370 0.432* —0.141 0.225 
Cu 0.004 0.090 0.952" —0.009 —0.020 
As —0.064 —0.080 —0.084 —0.097 —0.080 
PICu —0.001 —0.084 0.785* 0.055 0.065 
PrCu 0.005 0.152 0.833” —0.037 —0.055 
SI —0.069 —0.888* —0.095 0.012 0.015 
NDVI 0.979* 0.132 0.036 —0.028 0.014 
RVI —0.979* —0.136 —0.033 0.047 —0.033 
PVI —0.954* —0.062 0.027 0.066 0.025 
CI 0.956* 0.135 0.017 —0.024 0.043 
SAVI 0.979* 0.132 0.036 —0.028 0.014 


Note: * means that values were obtained for each factor. 


Fig.7 Diagram of factors obtained to explain the geomorphological influence on soil properties. SOC, soil organic 
carbon; NDVI, normalized difference vegetation index; RVI, ratio vegetation index; PVI, perpendicular vegetation 
index; CI, clay index; SAVI, soil-adjusted vegetation index; El, elevation; TWI, topographic wetness index; WI, 
wetness index; MrVBF, multi-resolution valley bottom flatness index; MrRTF, multi-resolution of ridge top flatness 
index; Slp, slope; Cur, curvature; PlCu, plan curvature; PrCu, profile curvature; CEE, calcium carbonate equivalent. 
The upper arrows indicate the increasing trend while the down arrows indicate the decreasing trend. 
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agricultural activities, and water infiltration (Morin and Van Winkel, 1996; Castillo-Monroy et al., 
2010). The locations with high contents of CCE frequently coincide with areas being of low organic 
matter content and exhibit weak correlations with remote sensing data. It is well-known that 
carbonates can block organic matter development by blocking the nitrogen interchange (Wang et al., 
2016). 

However, there is not a common consensus whether parent material or landform could be the main 
driving factor to explain pedogenesis and vegetation development (Tajik et al., 2012; Mehnatkesh 
et al., 2013). Our research question also agrees with other studies conducted in semi-arid areas. For 
example, based on rainfall simulations in semi-arid regions in Spain, Martínez-Hernández et al. 
(2017) concluded that different parent materials define different hydrological processes at the pedon 
scale. Differences in parent materials such as marls, sandstones, colluvium and slates can impact on 
pedogenesis, but those above mentioned researches also remarked that Slp and antecedent soil 
moisture also play an important role. The results of CA for landform and geology units in the present 
study confirm that soil properties could be classified into similar groups (clusters). However, by 
using the qualitative assessment, we observe that soil properties do also coincide with the IDW maps. 
Thus, it is confirmed that soil mapping techniques and statistical analysis have to be combined 
(Rodrigo-Comino et al., 2018), but it should also be aware of the parent materials which, as found 
in our study, control the spatial variation of CCE contents. 

With the use of IDW maps, we confirm that this interpolation method is the best tool to get an 
overview of the studied area (Li and Heap, 2011). In this study, SOC registers the highest R? and 
the lowest RMSE values, confirming that this factor obtains similar conclusion as obtained from CA 
and PCA (high correlation with the spatial variables). However, we acknowledge that there is no 
assessment of prediction errors from which to produce what are known as "bulls eyes" or higher 
values near the observed location (http://desktop.arcgis.com/en/arcmap/10.3/tools/geostatistical- 
analyst-toolbox/idw.htm). To solve this problem, researchers such as Kravchenko and Bullock 
(1999) added the number of the closest neighbouring samples using cross-validation to compare the 
results obtained with a different number of the closest neighbouring samples. In future research 
studies, it should be interesting to test different interpolation methods such as ordinary kriging, 
empirical Bayesian kriging or radial basis functions to observe if other soil properties do or do not 
coincide with SOC as the best soil property predictor using multivariate statistical analyses. 

Several previous studies such as Dwivedi and Sreenivas (1998) and Metternicht and Zinck (2003) 
demonstrated that PCA based on the correlation matrix is an effective approach in discriminating 
soils in arid and semi-arid regions. In our study, results of PCA show that the first five components 
are able to describe more than 73.3% of the total variability in soil properties. SOC is highly 
correlated with remote sensing variables. We then conclude the same finding with above mentioned 
previous studies: an increase of vegetation cover would demonstrate a drastic improvement of SOC. 
The center to the west of the study area also contains clay to clay loam soils, which confirms that 
soils can be more fertile and capable of being colonized by natural plants or crops. Regarding the 
obtained clusters, high SOC contents coincide with Qt3 and SC lithology, and PI2, Pil and Pi4 
landforms. These findings demonstrate that parent materials and landform characteristics can show 
well-conserved SOC levels and vegetative covers as well (Taghizadeh-Mehrjardi et al., 2016). 
Consequently, lithology and landforms might be considered as indicators of soil fertility. Our 
findings are agreement with those of Khaledian et al. (2017b), who reported that PCA can be widely 
used to discriminate soil indicators taking into account spatial variables. 

However, in our study, among all the topographic attributes, only MrVBF is significantly 
correlated with SOC content. The low correlation of topographic attributes with SOC content could 
be ascribed to the low-relief differences (about 600 m), the extreme semi-arid climate conditions and 
the high amounts of CCE content. Furthermore, there are slight correlations among silt, sand, El, 
TWI, WI, MrVBF and all of the remote sensing variables. Alijani and Sarmadian (2014) also 
confirmed that, in the semi-arid and arid regions in Iran, CCE is positively correlated with some 
topographical indices such as WI and Cur. Therefore, it could be possible that these parameters can 
be used to model CCE at larger scales. 
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Based on the findings of this study, we demonstrate that the combination of thematic maps of 
geology and landforms, topographical attributes and remote sensing data with CA and PCA 
approaches are able to distinguish clear spatial patterns only for specific soil properties on specific 
parent materials and landforms. Therefore, further researches need to be done to find the key 
parameters and factors in other specific areas, which are essential to designing correct and relevant 
solutions and appropriate land management plans for sustainable development in semi-arid regions. 


5 Conclusions 


This study aims to investigate the ability of multivariate statistical analyses, in combination with 
topographical attributes and remote sensing data, to identify the spatial distribution of soil properties. 
Cluster analysis reveals that higher SOC and clay contents are related to flat areas characterized by 
young quaternary deposits and alluvial fans. Conversely, massive limestones and conglomerate 
deposits in eroded and consolidated hills and rock outcrops with steep slopes register the lower 
content of SOC and fine particles, and the higher CCE content. Results of PCA show that the first 
five components are able to describe more than 73.3% of the total variability in soil properties. 
Environmental factors such as hillslope morphology and all remote sensing variables can largely 
explain the variability of SOC, but no significant correlation is found for soil texture and CCE content. 
Therefore, we conclude that SOC can be the best-predicted soil property because it follows a specific 
spatial pattern in semi-arid regions conditioned by parent material, hillslope morphology and 
vegetative cover. 
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